
    A New Framework for the Use of Variant Interpretation Tools in Clinical Practice

    Current ACMG/AMP guidelines for the use of sequence variants in genetic diagnosis and treatment permit the use of in silico predictors as Supporting evidence (PP3 and BP4 criteria). These criteria, however, lack quantitative support and leave clinicians and scientists without standards for applying them, leading to large variability in interpretation. To address this challenge, our team built upon previous work and introduced a novel criterion that can be used to calibrate any computational model or any other continuous-scale evidence on any variant type. We used it to estimate score intervals corresponding to the four strengths of evidence for pathogenicity and benignity for fourteen missense variant interpretation tools on carefully assembled data sets of known pathogenic and benign variants. We found that most tools achieved the Supporting evidence level for both pathogenic and benign classification using newly established data-driven thresholds. Importantly, at appropriate score thresholds, several in silico methods can also provide Moderate and Strong evidence levels for a limited number of variants. Based on these findings, we provided recommendations for quantitative revisions of the PP3 and BP4 criteria within ACMG/AMP guidelines and the future assessment of in silico methods for clinical interpretation.
    Book of abstract: 4th Belgrade Bioinformatics Conference, June 19-23, 202
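    The idea of score intervals mapped to evidence strengths can be illustrated with a minimal Python sketch. It is not the authors' calibration procedure: the function name and every threshold value below are hypothetical placeholders, chosen only to show how a calibrated lower bound per strength level might be applied to a single tool's pathogenicity score.

```python
from typing import List, Optional, Tuple

# HYPOTHETICAL lower bounds for one tool's pathogenicity score, ordered from
# strongest to weakest evidence. Real thresholds would come from calibration
# on known pathogenic and benign variants; these numbers are illustrative only.
PATHOGENIC_THRESHOLDS: List[Tuple[str, float]] = [
    ("Very Strong", 0.99),
    ("Strong", 0.93),
    ("Moderate", 0.80),
    ("Supporting", 0.65),
]


def pathogenic_evidence(
    score: float,
    thresholds: List[Tuple[str, float]] = PATHOGENIC_THRESHOLDS,
) -> Optional[str]:
    """Return the strongest evidence level whose lower bound the score meets."""
    for label, lower_bound in thresholds:
        if score >= lower_bound:
            return label
    # The score falls below every interval: no evidence is assigned.
    return None
```

    A symmetric table with upper bounds could be used in the same way for benign evidence (BP4).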

    Length-dependent prediction of protein intrinsic disorder

    BACKGROUND: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions. RESULTS: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long (>30 residues) disordered regions, respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder. CONCLUSION: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors. The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approach of modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use a
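    One way to picture how a meta predictor could integrate two specialized predictors is a per-residue weighted average, sketched below in Python. This is an assumption made for illustration, not the published VSL2 implementation: the function name, the inputs, and the use of the meta output as a mixing weight are all hypothetical.

```python
from typing import List, Sequence


def combine_disorder_scores(
    short_scores: Sequence[float],
    long_scores: Sequence[float],
    long_weights: Sequence[float],
) -> List[float]:
    """Blend per-residue scores from two specialized predictors.

    short_scores: disorder scores from a predictor tuned for short regions.
    long_scores:  disorder scores from a predictor tuned for long regions.
    long_weights: ASSUMED meta-predictor output in [0, 1] per residue, read
                  here as the weight given to the long-region predictor.
    """
    if not (len(short_scores) == len(long_scores) == len(long_weights)):
        raise ValueError("all inputs must have one value per residue")
    return [
        w * long_s + (1.0 - w) * short_s
        for short_s, long_s, w in zip(short_scores, long_scores, long_weights)
    ]
```

    With a weight of 1 the result equals the long-region score, with a weight of 0 it equals the short-region score, and intermediate weights interpolate between the two.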